Structure of puromycin-sensitive aminopeptidase and polyglutamine binding

Puromycin-sensitive aminopeptidase (E.C. 3.4.11.14, UniProt P55786), a zinc metallopeptidase belonging to the M1 family, degrades a number of bioactive peptides as well as peptides released from the proteasome, including polyglutamine. We report the crystal structure of PSA at 2.3 Å. Overall, the enzyme adopts a V-shaped architecture with four domains characteristic of the M1 family aminopeptidases, but it is in a less compact conformation compared to most M1 enzymes of known structure. A microtubule binding sequence is present in a C-terminal HEAT repeat domain of the enzyme in a position where it might serve to mediate interaction with tubulin. In the catalytic metallopeptidase domain, an elongated active site groove lined with aromatic and hydrophobic residues and a large S1 subsite may play a role in broad substrate recognition. The structure with bound polyglutamine shows a possible interacting mode of this peptide, which is supported by mutation.


Introduction
Bioactive peptides perform a variety of signaling functions [1,2]. As neuromodulators, they play a vital role in regulating activity throughout the nervous system, while in the periphery they maintain normal function in organ systems as well as modulate responses to environmental stimuli. A number of peptidases have been implicated in the control of bioactive peptide levels [1-4], and nearly all of these neuropeptidases utilize a zinc ion cofactor with catalytic domains structurally related to the well-characterized enzyme thermolysin [5]. A group of these peptidases belongs to the MA clan, which is characterized by a HEXXHX18E active site sequence motif, where the two histidines and the distal glutamate coordinate the zinc ion cofactor. A water molecule that also coordinates the zinc acts as the attacking nucleophile in catalysis. It is stabilized by hydrogen bonding to the first glutamate of the motif, which also serves as a general base to abstract a proton from the water, facilitating nucleophilic attack.
Within clan MA, members of the M1 family are aminopeptidases characterized by a second conserved sequence, GAMENW, in which the glutamate residue has been proposed to interact with the amino terminus of substrates [6]. M1 member puromycin-sensitive aminopeptidase (PSA or NPEPPS, E.C. 3.4.11.14, UniProt P55786), named for its inhibition by the antibiotic puromycin, accounts for the major fraction of cytosolic aminopeptidase activity in various human tissues [7]. PSA was first identified based on its metabolism of opioid peptides [8-10], and it has been implicated in cell cycle control [11,12], development of cell polarity [13-16], and processing of peptides released by the proteasome [17], as well as being identified as a target for cancer therapies [18,19]. PSA-deficient mice, known as Goku mice, have been generated by the gene trap method [20]. Both male and female Goku mice exhibit reproductive defects. The male Goku mice are infertile and lack copulatory behavior. Female Goku mice also show infertility due to impaired formation of the corpus luteum during pregnancy [21]. Deletion of the PSA orthologs in C. elegans and Drosophila also produces effects on reproduction and embryonic development [14,22-24]. The Goku mice in addition exhibit compromised pain perception and increased anxiety, which may result from changes in the levels of circulating enkephalins [25].
More recently, PSA gene expression in various regions of mouse brain was shown to correlate with low levels of expressed mutant human Tau protein, and overexpression of PSA inhibited Tau-induced neurodegeneration in a Drosophila model [26]. A direct role for PSA in metabolizing Tau has been proposed [26-29], although PSA cannot degrade Tau in vitro [30]. Thus, its precise role in Tau regulation remains unknown. PSA performs the important function of degrading polyglutamine peptides released by proteasomes [31], and loss of PSA activity results in increased aggregation and toxicity of polyglutamine expanded Huntingtin exon 1 in cultured cells and muscle [32]. Loss of PSA also affects SOD1 abundance and clearance, suggesting a neuroprotective role in amyotrophic lateral sclerosis [33]. PSA is therefore particularly associated with neurodegenerative disorders as well as normal peptide metabolism and reproductive function.
We present here a crystal structure of PSA that defines the architecture of the enzyme and the mechanism underlying its restriction to exopeptidase activity. The structure also suggests a basis for the enzyme's ability to degrade a broad range of peptide sequences. In addition, we define the binding path of a polyglutamine peptide in the active site of the enzyme.

Production of PSA
Human PSA for crystallization was expressed in insect cells using the BAC-TO-BAC system (Invitrogen) as described previously [34,35]. The coding sequence for PSA was introduced into the pFASTBAC-HT(B) intermediate vector, which codes for a polyhistidine affinity tag and a TEV protease site on the N-terminus of the protein. Protein was produced by culturing suspended Sf9 insect cells in sf-900 II serum free medium (Gibco BRL) at 27˚C.
Expressed PSA was purified by anion exchange chromatography using POROS HQ resin (GE Healthcare) with the sample and resin initially equilibrated with 20 mM Tris (pH 7.4). The enzyme was eluted with a gradient of increasing NaCl concentration ranging from 0.1 M to 1 M. Peak fractions corresponding to PSA were concentrated and run through a molecular sieving column (Sephadex G50) equilibrated with 20 mM Tris (pH 7.4). Fractions containing PSA were pooled, dialyzed against 10 mM HEPES buffer (pH 7.0) with 2.0 mM BME, and concentrated to 5-8 mg/ml of apparently homogeneous enzyme. The N-terminal sequence was not removed for crystallization trials. Before obtaining kinetic data for PSA, we switched to producing the enzyme in insect cells with a C-terminal polyhistidine sequence as previously described [30], and a metal affinity chromatography purification step was substituted for anion exchange chromatography. The F433A mutant was generated by PCR mutagenesis with this expression construct and sequenced to verify the change.

PSA crystallization and structure determination
PSA was crystallized by hanging drop vapor diffusion with initial conditions defined using commercially available solution screens (Hampton Research; Molecular Dynamics Ltd.). High quality crystals were obtained reproducibly at 16˚C using 5-8 mg/ml PSA mixed 1:1 with 15% PEG 4K, 0.1 M Tris (pH 8.5), 0.5 M sodium chloride, and 1.5% v/v dioxane. PSA crystals generally grew to full size in two weeks.
Crystals were prepared for X-ray data collection by serial transfer through solutions containing the crystallization components plus glycerol at concentrations increasing from 5 to 30% in 5% steps. Crystals were soaked in each solution for approximately 10 minutes. The crystals were then mounted in nylon loops and flash cooled by plunging into liquid nitrogen [36]. Data were collected on beamline 22ID at the Advanced Photon Source, Argonne National Laboratory and processed with HKL2000 [37]. The structure was determined by molecular replacement using tricorn interacting protease factor F3 (PDB ID 1Z1W) [

Polyglutamine peptide purification
A commercially prepared (Peptidogenic Research and Co., Livermore, CA) polyglutamine peptide (PQ) with the sequence Lys2Gln15Lys2 was purified with slight modifications to a published protocol [42]. Trifluoroacetic acid (5 ml) and hexafluoro-2-propanol (5 ml; TCI America) were added to a glass vial containing 4 mg of the PQ peptide. The mixture was vortexed intermittently for 2 minutes and left overnight at room temperature. The solvent was evaporated over a period of one hour using a gentle stream of argon gas, and the sample was then placed on a lyophilizer for half an hour to remove any residual solvent. Then 1 ml water (made to pH 3.0 with TFA) was added to the sample. This sample was spun at 50,000 x g for 3.5 hours at 4˚C to separate any remaining aggregated material. The top two thirds of the supernatant was recovered and flash frozen in liquid nitrogen for later use.

PSA-PQ complex
PSA protein crystals were grown as described above. The crystals were transferred into a solution containing 1 mM EDTA, 0.1 M Tris (pH 8.5), and 0.5 M sodium chloride and left to soak for 1 hour. These crystals were then transferred for 45 min into a solution containing the crystallization conditions plus 1 mM EDTA and 200 μM PQ peptide. The crystals were flash cooled by transferring them briefly to a cryosolution containing crystallization conditions plus 1 mM EDTA, 200 μM PQ peptide and 20% glycerol and then plunging loop mounted crystals into liquid nitrogen. Data were collected and processed as described above, resulting in a complete 3.65 Å data set. Difference maps were generated using the Phenix package [39] with the unliganded PSA structure, and polyalanine was modeled into the observed difference density with COOT [41] using manual building and real space refinement for the peptide only. Subsequently, the polyalanine peptide was converted to polyglutamine, and the PSA-PQ complex was refined in Phenix using positional restraints to the unliganded PSA structure because of the limited data resolution. Since there was generally no convincing electron density for the side chains beyond the beta carbons, the peptide was converted to polyalanine for deposition after an additional round of refinement.

Kinetic analysis
The kinetic properties of PSA were determined using the fluorogenic substrate alanine 4-methoxy-β-naphthylamide (Ala-4MβNA) in 20 mM HEPES (pH 7.0) and 2 mM BME at 37˚C [10]. Release of free naphthylamide was monitored on a fluorescence plate reader (SpectraMax Gemini XS) at an excitation wavelength of 335 nm and an emission wavelength of 410 nm. In particular, estimates of the Ki values for the PQ peptide and dynorphin A(1-17) were determined based on inhibition of Ala-4MβNA cleavage [10] at increasing concentrations of either peptide. Reactions were carried out with 20 μM (PSA wt) or 100 μM (PSA F433A) Ala-4MβNA in a total volume of 200 μl, with reaction rates measured in triplicate. Single reciprocal plots of the inhibition data (reciprocal rate versus peptide concentration) were fit by linear regression using the Prism software package (GraphPad Prism), with the X intercept corresponding to -Ki(1+[S]/Km), where [S] is the concentration and Km is the Michaelis constant of Ala-4MβNA. Km values for Ala-4MβNA were 29.2 μM for PSA wt and 160 μM for PSA F433A.
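The relationship between the X intercept of the single reciprocal plot and Ki can be illustrated with a short calculation. The sketch below is not the authors' Prism analysis: it assumes simple competitive inhibition, generates synthetic rates from that model, fits a straight line to reciprocal rate versus inhibitor concentration by ordinary least squares, and recovers Ki from the X intercept. The inhibitor concentrations and Vmax are hypothetical; only the 20 μM substrate concentration and the 29.2 μM Km are taken from the text, and the input Ki of 1.3 μM is used purely as an illustrative value.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def competitive_rate(substrate, inhibitor, vmax, km, ki):
    """Michaelis-Menten rate in the presence of a competitive inhibitor."""
    return vmax * substrate / (km * (1.0 + inhibitor / ki) + substrate)


def ki_from_single_reciprocal(inhibitor_concs, rates, substrate, km):
    """Estimate Ki from the X intercept of a 1/v versus [I] plot.

    For competitive inhibition the X intercept equals -Ki * (1 + [S]/Km).
    """
    slope, intercept = fit_line(inhibitor_concs, [1.0 / v for v in rates])
    x_intercept = -intercept / slope
    return -x_intercept / (1.0 + substrate / km)


# Synthetic example (hypothetical inhibitor series and Vmax, in uM and arbitrary units).
S_UM, KM_UM, KI_UM, VMAX = 20.0, 29.2, 1.3, 1.0
inhibitor_series = [0.0, 1.0, 2.0, 4.0, 8.0]
synthetic_rates = [competitive_rate(S_UM, i, VMAX, KM_UM, KI_UM) for i in inhibitor_series]
recovered_ki = ki_from_single_reciprocal(inhibitor_series, synthetic_rates, S_UM, KM_UM)
print(f"Recovered Ki: {recovered_ki:.3f} uM")
```

Because the synthetic rates are noise-free, the fit returns the input Ki; with real triplicate measurements the same arithmetic applies, but the confidence interval would have to come from the regression statistics.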

PSA architecture
Data and model statistics for crystal structures of PSA and PSA with bound peptide are provided in Table 1. PSA adopts a lopsided, V-shaped overall conformation, creating a central groove that is about 20 Å long and 15 Å wide (Fig 1; secondary structure versus sequence and residue number shown in Fig 2). The longer arm of the V consists of residues from the N terminus through residue 594, while the shorter, C-terminal arm comprises residues 595-914. The overall architecture is similar to other M1 family peptidases with known structures, including tricorn interacting factor F3 [38], aminopeptidase N from Escherichia coli (ePepN) [43,44], aminopeptidase from Plasmodium falciparum (Pfa-M1) [45], and the smaller leukotriene A4 hydrolase (LTA4H) [46]. More recently, the human endoplasmic reticulum aminopeptidases (ERAP1 and ERAP2) have also been shown to share the same overall fold [47-49]. The N-terminal arm of PSA can be further divided into 3 distinct regions: N-terminal domain (domain I, residues 51-253; residues 1-50 not ordered in the crystal structure), catalytic domain (domain II, residues 254-503), and linker domain (domain III, residues 504-594). This arrangement is primarily defined by the presence of the central metallopeptidase domain, which has strong structural similarity to the bacterial metalloprotease thermolysin [50,51].
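For readers who want to place a residue mentioned later in the text into its domain, the boundaries given above can be written as a small lookup table. This is only a convenience sketch: the names `DOMAINS` and `domain_of` are hypothetical, and the residue ranges are copied from the paragraph above rather than from the deposited coordinates.

```python
# Domain boundaries of human PSA as stated in the text (residue numbers inclusive).
DOMAINS = [
    ("disordered N-terminus", 1, 50),
    ("I (N-terminal)", 51, 253),
    ("II (catalytic)", 254, 503),
    ("III (linker)", 504, 594),
    ("IV (C-terminal)", 595, 914),
]


def domain_of(residue_number):
    """Return the domain label for a PSA residue number."""
    for name, first, last in DOMAINS:
        if first <= residue_number <= last:
            return name
    raise ValueError(f"residue {residue_number} is outside the range 1-914")


# Residues discussed in the text: Gln178, Glu319, Phe433, His700.
for residue in (178, 319, 433, 700):
    print(residue, domain_of(residue))
```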
Domain I of PSA is dominated by a large beta sheet that forms the end of the longer arm of the molecule (Fig 3A). The eight strands in this sheet are arranged in a mixed orientation and adopt a saddle shaped structure. Three smaller sheets of three, two, and two strands, tuck under the ends of the saddle, where they form part of the interface with the catalytic domain. The eight-stranded sheet consists of beta strands 1, 2, 4, 5, 8, 11, 13 and 14. The three-stranded sheet (strands 3,6 and 7) has antiparallel strands with one solvent exposed face, and the two stranded sheets (strands 9 and 10 and strands 12 and 15) are nearly parallel and in the same plane at the other end of the saddle. These two smaller sheets form a beta sandwich element with a sheet from the catalytic domain, making extensive contacts at the interface. The N-terminal domain of PSA appears to be unique to the M1 aminopeptidases, with no strong similarities detected to structures outside the family.
The catalytic domain (domain II) is composed of a mixed β-sheet and a large helical cluster consisting of helices 3-14 with a small two-stranded parallel sheet in addition (Fig 3B). The mixed sheet consists of strands 16-20, and it forms a large part of the interface with the N-terminal region. All zinc metallopeptidases with known structures that function as neuropeptidases have a conserved thermolysin-like [52-55] active-site fold [56], and the active site domain of PSA superimposes on thermolysin (PDB ID 1L3F) with an r.m.s.d. of 5.2 Å over 216 Cα atoms out of 251 residues. The active site itself contains the conserved motif HEXXH, which is present in α5. As in other metallopeptidases, the two histidine residues (His352 and His356) of the motif coordinate the zinc ion, and the glutamate residue (Glu353) hydrogen bonds to a water molecule that is also coordinated to the metal (Fig 4). In addition, another glutamate residue (Glu375) present in α6 acts as a fourth zinc-coordinating group. This coordination of the zinc ion by two histidine residues and a downstream glutamate is characteristic of clan MA metallopeptidases. In PSA and other M1 aminopeptidases, the downstream glutamate is part of a conserved sequence motif, NEXFA [38,43,50,51]. Residues from helices 6, 8, 9, 11 and 19 largely make up the floor of the active site. The active site walls are formed by the edge of the five-stranded sheet (strand 16), helices 5, 10, 21, 25, 28, 31 and 33 as well as loops made up of residues 151-159 and 327-340. While the active site is composed principally of residues from the catalytic domain, the N-terminal domain contributes residues, including Gln178 and Glu180, and helices from the C-terminal domain (domain IV) form one side of the active site pocket. In some conformations of factor F3, an arginine residue (Arg721) from domain IV contacts an extended loop (residues 324-250) from the active site domain, particularly interacting with the side chain of Phe346 [38].
In PSA, the equivalent of Arg721 is Phe846, but the segment containing this residue (around the N terminus of α31) is shifted away from the catalytic domain by about 6 Å relative to factor F3, and no contacts are made. There are interactions, however, between the N terminus of α5, which contains the HEXXH motif, and the turn between helices 20 and 21 (particularly His700) in the C-terminal domain. Interactions also occur between helices 8 and 9 of domain II and α17 from domain IV. Although neither of these interactions is extensive, they likely contribute to maintaining the open conformation of the enzyme.
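The zinc ligands named above (His352, Glu353, His356 and Glu375) have exactly the spacing expected for the clan MA motif, in which the second histidine is followed by 18 residues and then the distal glutamate. The sketch below expresses the motif as a regular expression and reads off the ligand positions from a match. The toy sequence and its starting residue number are hypothetical placeholders chosen so that the numbering lines up with the text; they are not the PSA sequence.

```python
import re

# Clan MA zinc-binding motif: H-E-X-X-H, 18 arbitrary residues, then E.
ZINC_MOTIF = re.compile(r"HE..H.{18}E")


def zinc_ligands(sequence, first_residue_number=1):
    """Return residue numbers of the motif positions for each match in the sequence."""
    hits = []
    for match in ZINC_MOTIF.finditer(sequence):
        first_his = match.start() + first_residue_number
        hits.append({
            "first_His": first_his,
            "general_base_Glu": first_his + 1,
            "second_His": first_his + 4,
            "distal_Glu": first_his + 23,
        })
    return hits


# Hypothetical toy fragment, numbered so that its first residue is 349.
toy_fragment = "AAA" + "HEAAH" + "G" * 18 + "E" + "AAA"
ligands = zinc_ligands(toy_fragment, first_residue_number=349)
print(ligands)
```

With the first histidine placed at 352, the motif puts the general base glutamate at 353, the second histidine at 356 and the distal glutamate at 375, matching the residue numbers reported for PSA.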
The linker domain (domain III) serves to make the connection between the N- and C-terminal portions of the enzyme (Fig 3C). This domain follows the metallopeptidase active site region and consists of two sheets containing strands 23, 24, 25, 28, and 30, and 26, 27 and 29 that are packed against each other to form an immunoglobulin-like β-sandwich fold. The smaller three-stranded sheet primarily makes contacts with domain II (particularly α11 and α14), while the edges of both sheets interact with domain IV (α16 and α17). The interface with domain II buries 2010 Å² of solvent exposed surface, while the contact with domain IV is about the same size, burying 1990 Å² of exposed surface. Both domain-domain interfaces are largely hydrophobic and aromatic in nature, and it seems likely that they are rigid and stable.
The C-terminal domain IV forms the short arm of the V-shaped PSA molecule (Fig 3D). It consists of 18 helices arranged into two superhelical HEAT repeat segments [57]. The first six helices (α16-21) form one superhelical segment. Two subsequent helices (α22 and α23) serve to turn the path of the superhelix roughly 120˚, and the remaining ten helices (α24-33) form the second superhelical segment. The C-terminal helix (α33) of this second segment is elongated, and it interacts with the first superhelical segment to form a closed loop. Domain IV is known to be required for proper folding of the remainder of the molecule when expressed in E. coli [58]. The conformation of the entire domain is unique to the M1 peptidases, but as expected, a number of other proteins with HEAT repeats show structural similarity to the individual repeats of PSA, particularly the longer C-terminal repeat. The C-terminal domain in endoplasmic reticulum aminopeptidase 1 (ERAP1) has been shown to interact with the C terminus of a bound 15-residue peptide analog [59]. The interacting residues are not conserved in PSA, but other residues in the corresponding region, particularly Lys712 and Lys715, might mediate a similar interaction. More generally, domain IV of ERAP1 has been shown to mediate binding of the C termini of peptide-like inhibitors, as well as an allosteric effector, at a distributed set of sites, which can serve to modulate a large-scale conformational change in the enzyme [60,61]. PSA domain IV could play a similar role.
HEAT repeat superhelices often mediate protein-protein interactions [57]. PSA has been reported to co-localize with tubulin [11,62], and it is interesting to note that one of two putative microtubule associated protein (MAP) sequences present in PSA [11] is located within the first superhelical segment of the domain IV (residues 682-703; Fig 5). In Tau and a number of other proteins that interact with microtubules, MAP sequences help to mediate the binding interaction [63][64][65][66][67]. The other putative MAP motif in PSA is located in the catalytic domain (residues 266-289), where it comprises the C-terminal portion of β17, the following loop segment, and the N-terminal portion of α3. In contrast, the MAP motif in domain IV forms portions of two helices: the C-terminal segment of α20 and the first turn of α21, as well as the nine-residue intervening loop, which contains a sequence similar to the most conserved Pro-Gly-Gly-Gly sequence of the MAP motif. In MAP sequence containing proteins, the motif appears to be unstructured when not interacting with tubulin, but a portion of the sequence may form an additional strand or two of an α-tubulin sheet when bound to microtubules as seen in the crystal structure of a complex between tubulin and the MAP-related stathmin-like domain sequence [66,68]. While the PSA MAP sequence in domain II has only a relatively short loop segment that is not positioned well to interact with a large structure like a microtubule, the longer loop in the domain IV MAP sequence points into solvent in an orientation that would likely allow it to mediate an interaction with microtubules. In that regard, some proteins that promote tubulin polymerization interact via TOG domains, which are formed from HEAT repeats [69][70][71][72], and it is possible that other loop segments in the domain IV HEAT repeats contribute to an interaction with microtubules. 
The two MAP sequences in PSA also show similarity to a characteristic sequence motif in some proteasome subunits [11]. For example, the proteasome sequence occurs near the N-termini of the alpha subunits in the yeast 20S proteasome structure, forming an open coil segment and a short helix that make up part of the gate assembly of the proteasome [73]. While this structure of the motif is like the one adopted by the domain II MAP sequence, no functional significance of this similarity is suggested by existing knowledge of PSA activity.

Comparison of M1 aminopeptidases
The crystal structures of other aminopeptidases in the M1 family include: aminopeptidase A (APA) [74], tricorn interacting factor from Thermoplasma acidophilum (factor F3) [38], aminopeptidase N (APN) [75][76][77] [78], insulin regulated aminopeptidase (IRAP) [79,80], cold-active aminopeptidase from Colwellia psychrerythraea (ColAP) [81], aminopeptidase A from Legionella pneumophila (LePepA) [82] and aminopeptidase N from Deinococcus radiodurans (M1dr) [83,84]. Alignment of the human PSA sequence with these other M1 members indicates they are not closely related. Sequence identities range from 15-32% (similarity 24-49%), with most of the human paralogs (APA, APN, ERAP1, ERAP2, and IRAP), as well as factor F3 and AnAPN1, being at the high end of the range. Although the lopsided V-shaped overall architecture of PSA is maintained in the other aminopeptidases, most of the enzymes crystallize in a closed conformation where a portion of domain IV shifts to interact with the catalytic domain II, eliminating the gap between the two arms of the V (Fig 6). This conformational change primarily involves a rigid rotation of the domain IV second HEAT repeat (and the transition helix α22).
Several crystal forms of ERAP1 [47,48,61], two forms of APN [75,76], and factor F3 [38] also adopt the fully open conformation seen in PSA. (Unliganded IRAP is not open to the same extent as PSA, but domain IV is not in contact with domain II. Binding of an inhibitor causes it to adopt a fully closed conformation [85].) The enzymes in the closed conformation generally have been crystallized with bound inhibitors, peptides, or individual amino acids at the active site. However, it appears that crystallization may trap lower probability conformers, for example unliganded Pfa-M1 in the closed form [45] or ERAP1 bound to peptide-like inhibitors in the open form [48,61]. In the case of ERAP1, X-ray scattering and other studies convincingly demonstrate that substrate mimics shift the conformational equilibrium toward the closed form in solution [60,61], and this seems likely the case for at least most other M1 members. The AlphaFold Database prediction for human PSA adopts the closed conformation [86,87], which may be influenced by the abundance of closed conformation templates. Running AlphaFold Colab [86] without templates, however, also generates a prediction in the closed conformation, indicating coevolutionary restraints at the domain II-IV interface. The closed PSA model has an internal chamber with only narrow solvent openings. CASTp [88] reports a volume of 3510 Å³ for this chamber, which would be sufficient to accommodate the largest reported PSA substrate, dynorphin A(1-17) with an excluded volume of 2023 Å³. Except for the shift of domain IV, the individual domains of all the M1 peptidases maintain the general structure seen in PSA with RMSD values on Cα superposition varying between 1. Notably, average atomic thermal factors are higher in PSA domain I and the second HEAT repeat of domain IV (see Table 1

Other residues involved in catalysis
Stabilization of the oxyanion generated in the transition state by zinc metallopeptidases generally involves not only the positively charged zinc ion but also one or two hydrogen bond donating side chains [50]. In thermolysin, His231 and Tyr157 likely donate hydrogen bonds in this manner. Tyr438 in PSA, which is in α11, occupies the position equivalent to His231 and could participate in transition state stabilization (Fig 7A). Mutating this residue to phenylalanine reduces kcat by 1000-fold, indicating its importance in catalysis [35]. Tyr157 in thermolysin is in the loop between the active site helices. The most structurally equivalent residue in PSA is Trp367, which is conserved in a number of other M1 aminopeptidases. This residue is far (over 16 Å) from the active site zinc ion, however, and is unlikely to participate in catalysis. Tyr378 in LTA4H has been proposed to act as a second stabilizing residue [46]. This tyrosine is conserved in APN [43], but it is replaced by phenylalanine in both PSA and factor F3 and therefore could not function to stabilize the oxyanion in these enzymes. Another tyrosine, residue 244, is in the vicinity of the active site in PSA where it could possibly participate in catalysis (see Fig 7A). It is located in the turn connecting strands 14 and 15 of the N-terminal domain; however, the distance between its hydroxyl group and the expected position of the carbonyl oxygen is over 8 Å, indicating that a conformational change would be needed for it to participate in transition state stabilization.

Aminopeptidase activity
Thermolysin and many other zinc metallopeptidases act as endopeptidases, cleaving peptide sequences internally. Their active sites allow bound substrate peptides to extend in either direction from the catalytic machinery. In PSA, however, the active site is closed off at one end by elements of the N-terminal domain (Fig 7B and 7C), restricting the extent of the peptide N-terminal to the cleavage site. Specifically, strands and turns from the second β sheet of that domain, as well as residues from the turns between helices 5 and 6 and helices 12 and 13 pack to form a structural wall that limits substrate binding to one amino acid N-terminal to the scissile bond. Thus, the active site channel of PSA resembles a blind canyon with the catalytic machinery located near its closed end. The aminopeptidase specific GAMENW sequence [43,44,89,90] encompasses one edge strand (β19; residues 316-321) of the five-stranded sheet in the catalytic domain and the following open coil segment. Glu319 from this sequence is well positioned to interact with the N-terminal amino group of bound peptides, as has been proposed [43,44]. Together with the nearby Glu180, it creates a pocket with strongly negative electrostatic potential that likely binds the N-terminus of substrate peptides, helping to position them appropriately for catalytic removal of the first residue. Near the active site, bound substrate peptide likely interacts with main chain groups of residues in strand 19, the edge of the catalytic domain central sheet, as expected for the binding of substrates to zinc metallopeptidases [56]. In particular, the carbonyl group of Ala317 likely accepts a hydrogen bond from the main chain amine of the P1 substrate residue. In addition, the main chain amine of Ala317 is in position to interact with the carbonyl group of the P1' substrate residue. 
The side chains of Val349, Ser379, and Glu382 are also in position to interact with the P1' residue depending on the path of the substrate as it exits the immediate active site region.
The active site of PSA lies at the end of a long groove in the enzyme, presenting a large surface that likely provides the basis for interaction with the extended portion of peptide substrates C terminal to the scissile bond (Fig 8). Interestingly, this potential substrate-binding surface in PSA is enriched in aromatic and hydrophobic residues, with some 36 solvent-exposed hydrophobic/aromatic residue side chains around the active site. The floor and sides of the active site channel are lined with these hydrophobic/aromatic residues, and, in particular, residues from helices α5 and α6 make much of the floor of the channel where substrates likely interact.

PLOS ONE

Hydrolysis of polyglutamine peptides
Glutamine-rich sequences are found in many cellular proteins, including those associated with neurodegenerative disorders. The Huntingtin protein, for example, contains polyglutamine sequences [91] prone to expansion as a result of errors during DNA replication [92]. Expanded polyglutamine sequences tend to aggregate [93] and form inclusions that are the pathological hallmarks of neurodegenerative diseases like Huntington's disease and spinocerebellar ataxia [92,94]. Importantly, polyglutamine tracts are not degraded efficiently by the proteasome [95]. PSA is the only proteolytic activity in HeLa cells identified as being responsible for processing polyglutamine sequences, and the enzyme was able to degrade long polyglutamine peptides (20-30 glutamines) as efficiently as short polyglutamine-containing peptides [31]. Moreover, PSA knockdown or inhibition increased polyglutamine accumulation and PSA overexpression had the opposite effect in other cell types [32]. To help identify specific residues in PSA that may have a role in polyglutamine turnover, we determined the crystal structure of PSA in complex with a 19-mer polyglutamine peptide (PQ) having the sequence Lys2-Gln15-Lys2.
Difference electron density in the active site region (Fig 9A) defined the binding site of the PQ peptide, and a polyalanine peptide was initially modeled into the low resolution density. The bound PQ is in position to interact with the glutamate residue (Glu319) of the GAMENW aminopeptidase recognition sequence in β19 as well as Glu180 from domain I. The peptide initially extends along strand 19, donating a hydrogen bond from the P1 residue to the carbonyl group of Ala317. The path of the peptide then turns, however, allowing it to interact with residues in helix 10 and the following loop, which includes Phe433. Fitting the backbone density in this manner results in a cis peptide bond between the P1 and P1' residues, but this unusual configuration may be a result of uncertainty in the build due to the low resolution of the density. Subsequent restrained refinement of the peptide converted to polyglutamine showed little side chain density with the exception of the P4' residue (Fig 9B). Nevertheless, the glutamine side chains can adopt favorable conformations with no major clashes, and their positions suggest potential interactions with the protein. The P1 side chain may interact with Glu375 and Tyr438, and Gln178 is positioned to contact the side chain of the P1' residue. The side chain of the P3' residue may interact with Phe433 and the P4' side chain with nearby Asp430. The electron density after P5' becomes weak, preventing any further tracing of the substrate backbone path. In all, 6 alanine residues were included in the final peptide model. Interestingly, the electron density indicates that the zinc ion cofactor was present with high occupancy in the crystal despite the EDTA soak intended to remove it. Thus, the enzyme would have retained at least partial activity during peptide soaking, and it is likely that the bound fragment represents an average of partially degraded peptides of different lengths.
The electron density is consistent with the carbonyl oxygen of the first substrate peptide bond coordinating the zinc ion in an orientation similar to that expected during hydrolysis.
No major changes in the overall conformation of PSA are evident from the electron density upon binding the PQ substrate. Since the peptide was soaked into PSA crystals, it is likely that lattice contacts in the crystal prevent a conformational change in PSA despite the presence of substrate at the active site.
As noted, the interaction with PQ appears to involve Phe433. To further assess the role of this residue, it was mutated to alanine (PSA F433A) and the mutant protein produced for kinetic studies in comparison with wild type PSA. Ki values were determined for the PQ peptide and a reference substrate, dynorphin A(1-17), by competitive inhibition of the fluorogenic substrate alanine 4-methoxy-β-naphthylamide (Ala-4MβNA) (Fig 10, data for graphs in S1 Table). Both PQ and dynorphin A(1-17) were found to be competitive with the fluorogenic substrate (Fig 10A and 10B). The apparent Ki for the PQ peptide with wild type PSA was 1.3 μM, 95% CI [1.10-1.54] (Table 2). The Ki of PQ with the F433A mutant was found to be 4.9 μM, 95% CI [2.9-7.9], or 3.6 fold higher than wild type. Thus, mutating Phe433 reduces affinity for the PQ peptide, consistent with it playing a role in binding as indicated by the crystal structure. Interestingly, mutating Phe433 also decreased affinity for the reference peptide, dynorphin A(1-17). The Ki with PSA F433A, 2.6 μM, 95% CI [2.38-2.82], was 5.9 fold higher than the Ki with wild type PSA, 0.44 μM, 95% CI [0.38-0.51]. Either dynorphin A(1-17) interacts in a similar manner as PQ or mutating Phe433 has a more general effect on substrate binding.
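The fold changes quoted above follow directly from the ratio of mutant to wild type Ki. The short sketch below recomputes them from the rounded values given in the text; the dictionary layout and function name are hypothetical, and no data beyond the four Ki values are assumed.

```python
# Apparent Ki values in micromolar, as quoted in the text.
KI_UM = {
    "PQ": {"wt": 1.3, "F433A": 4.9},
    "dynorphin A(1-17)": {"wt": 0.44, "F433A": 2.6},
}


def fold_change(peptide):
    """Ratio of the F433A Ki to the wild type Ki for the given peptide."""
    values = KI_UM[peptide]
    return values["F433A"] / values["wt"]


for peptide in KI_UM:
    print(f"{peptide}: {fold_change(peptide):.2f}-fold higher Ki for F433A")
```

With these rounded inputs the dynorphin A(1-17) ratio agrees with the reported 5.9-fold change, while the PQ ratio comes out slightly above the reported 3.6-fold value, a difference that may simply reflect rounding of the quoted Ki values.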

Discussion
Most of the cytosolic aminopeptidase activity in the mammalian brain and likely other tissues is attributable to puromycin-sensitive aminopeptidase (PSA). It was first purified from rat brain [96,97] and bovine brain [8] based on its cleavage of enkephalin. PSA orthologs have been found in a wide range of organisms, including plants [98], primitive eukaryotes [22,23] and amphibians [99], and the presence of clear orthologs across Eukarya suggests an essential function for PSA. M1 aminopeptidases exist in Archaea [100] and bacteria [101], indicating that the family is of ancient origin. The work reported here shows a close structural similarity between PSA and these distantly related prokaryotic enzymes. As noted, in vitro and in vivo studies have identified a number of substrates for PSA [8,9,102-104]. A key feature of PSA, therefore, is its ability to accommodate a number of substrate amino acid sequences in its active site. It is clear that, although preferences exist, various types of amino acids can be accommodated at any position relative to the cleavage site. While peptidases frequently do not have absolute specificities at particular positions, the ability to recognize such a broad range of seemingly unrelated sequences is often a characteristic of zinc metallopeptidases that metabolize bioactive peptides [3,105-108].
The PSA structure suggests two factors that may contribute to the broad substrate specificity of the enzyme. In the related APN with bound bestatin [43], the position of the phenyl group of bestatin likely defines the S1 subsite of the enzyme. Interestingly, Met260, which forms part of the subsite, must change conformation in order to accommodate the bulky phenyl group [44]. LTA4H has an even larger tyrosine residue at the equivalent position [46]. On the other hand, PSA has a much smaller alanine residue (Ala315) at this site. The smaller residue in PSA opens up the site relative to the other aminopeptidases, suggesting that it may be even less selective at the P1 substrate position. In the APN complex, a glutamine residue is also present in the S1 subsite, and this residue is conserved in PSA (Gln178) and LTA4H. In Factor F3, however, this residue is a histidine (His99) [38]. Factor F3 prefers negatively charged residues at the P1 position, and the substitution of the at least partially positive histidine for the polar glutamine at this position likely accounts for that preference. The nature of the residues in the likely S1 subsite of PSA, particularly the presence of the small Ala315 and the polar Gln178, may allow for a broad range of residues at the substrate P1 position. In addition to these considerations at P1, the presence of many aromatic and hydrophobic residues around the putative S1 subsite and other regions near the active site may mediate broad specificity by allowing different substrates to interact with different portions of this flat, carbon-rich surface.
The reduced PQ binding in the F433A mutant supports the interpretation that the electron density seen in the complex crystal structure reflects the backbone path of the PQ peptide. In addition, the observation that the peptide acts as a competitive inhibitor of a small fluorogenic substrate is consistent with the catalytic binding mode observed in the crystal structure. Phenylalanine is generally conserved at the equivalent of position 433 in other M1 aminopeptidases, except in PfA-M1 and LTA4H where there is a conservative change to tyrosine. Structures of peptide analog inhibitors or peptides complexed with M1 family aminopeptidases align well in the active site, but the paths of the ligand backbones diverge as they extend toward what would be the C termini of bound substrates (Fig 11A). Interestingly, bestatin bound to APN extends in the direction of the PQ peptide bound to PSA, although the peptide mimic bestatin is in the opposite orientation, with its C-terminal carboxyl group coordinating the active site zinc ion. This diversity of interactions, taken with the binding path of PQ reported here, supports the proposal that the surface near the active site can accommodate a number of substrate binding modes. Additional crystal structures of PSA with different bound peptide substrates will be needed to test this proposal. Since PSA in the crystals described here is likely constrained by lattice contacts to remain in the open conformation, ideally additional structures would be determined with pre-formed enzyme-substrate complexes to enhance relevance to interaction in solution.
The M1 family peptidases have been crystallized in two overall conformations differing by a hinge-like motion of the C-terminal domain IV relative to the long, N-terminal arm (domains I-III) of the V-shaped enzyme. In the majority of the crystal structures, the C-terminal domain is closed over the active site, interacting extensively with domain II, which restricts access and possible substrate binding modes. In contrast, PSA, Factor F3, and forms of ERAP1 and APN adopt open conformations, with domain IV rotated away from the N-terminal arm by about 40° in most cases. It has not, however, been established whether all the M1 peptidases sample both open and closed conformations in solution (with perhaps different equilibrium distributions for the different peptidases). The observations that ERAP1 crystallizes in both conformations [47,48], and that different Factor F3 molecules in the crystal asymmetric unit show different rotations of domain IV [38], suggest that in at least some cases the relative positions of the N- and C-terminal arms can vary dynamically. The consequences of these conformational dynamics for the range of substrate binding modes remain to be established.
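A domain rotation such as the roughly 40° opening described above is commonly quantified from the rotation matrix that superimposes the moving domain in one conformation onto the other, using θ = arccos((tr R − 1)/2). The sketch below illustrates only this final step and is not the method used to obtain the value quoted in the text; the function names are hypothetical, and the test matrix is a synthetic rotation about the z axis rather than one derived from any M1 structure.

```python
import math

def rotation_angle_deg(r):
    """Rotation angle (degrees) encoded by a 3x3 rotation matrix.

    Uses trace(R) = 1 + 2*cos(theta); the cosine is clamped to [-1, 1]
    to guard against small floating point overshoots.
    """
    trace = r[0][0] + r[1][1] + r[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def rotation_about_z(angle_deg):
    """Synthetic rotation matrix about the z axis, used here only as test input."""
    a = math.radians(angle_deg)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a),  math.cos(a), 0.0],
            [0.0,          0.0,         1.0]]

# Recover the angle from a synthetic 40 degree rotation.
print(round(rotation_angle_deg(rotation_about_z(40.0)), 3))
```

With real coordinates, the rotation matrix would first be obtained by least-squares superposition of domain IV after aligning the N-terminal arms of the open and closed structures.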
Addlagatta and colleagues have suggested a binding mode for the PSA-specific inhibitor puromycin based on the structure of puromycin bound to an inactive mutant of ePepN and docking to a closed form PSA homology model [112]. The nucleoside portion of the inhibitor interacts near the active site zinc ion and coordinating residues, while the remainder of the molecule extends toward helix 31 of domain IV. The open conformation PSA structure reported here was superimposed domain-by-domain on the AlphaFold PSA model to generate a model for the closed conformation of the enzyme. The puromycin binding mode suggested previously is largely compatible with this closed model (Fig 12A) and was used as the starting point for docking with ROSIE Ligand_docking [113-115]. Interestingly, the three lowest energy models showed similar positions and orientations for puromycin (see Fig 12A). The docked puromycin ligands adopt more compact conformations and move away from the active site toward the surface of domain IV relative to the puromycin binding mode proposed earlier. In this position, a number of PSA side chains are placed to interact with the ligand, primarily from helix 11 of domain II and helix 31 of domain IV (Fig 12B). Puromycin soaked into crystals of active ePepN showed hydrolysis products in the active site [112]. Superimposing that structure on the closed model of PSA shows that O-methyl-L-tyrosine (OMT) fits well into the S1 subsite of the PSA closed model (Fig 12C). The puromycin aminonucleoside (PAN) fragment, while bound to ePepN in an orientation different from its position during hydrolysis, is also not obstructed by any groups in the closed PSA model. Therefore, the structure affords no obvious reason why puromycin is not hydrolyzed to any great extent by PSA. Its functioning as a competitive inhibitor likely results from an unproductive, high affinity binding mode, like the one suggested by the docking study, that sterically restricts access to the active site.
Fig 11 caption (partial): Hinge motion at the interface between PSA domains 1 and 2. Cα traces of domain 1 (red) and domain 2 (gold) from the crystal structure are shown superimposed on the trace of a structure from normal mode analysis (gray) using the NOMAD-Ref server [111]. Movement of domain 1 in the normal mode analysis relative to its position in the crystal structure can be seen as a shift of the gray trace toward the top of the figure. https://doi.org/10.1371/journal.pone.0287086.g011
Fig 12 caption (partial): ... [113-115]. Puromycin shown with yellow carbon atoms is from superposition of an inactive ePepN-puromycin complex reported by Addlagatta and colleagues [112] on the closed PSA model. That puromycin pose was used as the starting point for docking with the closed PSA model. The three lowest energy complexes are shown with green, purple, orange carbons, corresponding protein side chains, and ...
PSA has been implicated in the metabolism of two proteins associated with protein aggregation disorders, Tau and superoxide dismutase [26-29,33]. In both cases, reports suggest that PSA may play a direct role in degrading these large substrates, and evidence has been presented for endopeptidase activity of PSA. However, a study using purified PSA and Tau failed to find a direct role of PSA in Tau degradation [30]. PSA has been reported to stimulate autophagy [32], and this may at least in part account for its effects on levels of Tau and superoxide dismutase, both of which have been shown to be degraded via macroautophagy as well as other mechanisms [116,117]. Alternatively, since PSA contains microtubule-binding sequences and has been shown to co-localize with tubulin [11,62,118], it is possible that it may influence Tau lifetime by increasing the proportion of protein not bound to microtubules. Localization to microtubules may also play a role in the function of PSA in meiotic cell division, where its absence causes defects in chromosome segregation, recombination, development of cell polarity, and cell cycle progression [22,98]. Here PSA peptidase activity may be required, since inhibitors reproduce at least some of the effects of gene knockouts.
Despite questions regarding direct degradation of Tau or superoxide dismutase (SOD), it is useful to examine the PSA structure with regard to its potential activity on large substrates or at least interaction with proteins. The open conformation of PSA may allow loop segments from folded or partially folded proteins to enter the active site groove. The structural barrier at one end of the active site, however, makes it unlikely that a loop segment could bind in a productive manner. Since this barrier is largely composed of elements from the N-terminal domain of PSA, one possible mechanism for endolytic cleavage would be the N-terminal domain of PSA swinging away from the metallopeptidase active site region in a hinge-like motion (Fig 11B). Such a movement of the N-terminal domain would open the closed end of the active site channel, allowing a protein loop segment to extend on either side of the active site for endolytic cleavage. Only a single backbone connection exists between the N-terminal and catalytic domains, and this connecting segment is in an open coil conformation. Therefore, this region might act as the hinge. In this model, the hinge-like conformational change would be a relatively rare event, consistent with the reported poor efficiency of large substrate degradation by PSA [27]. The interface between the N-terminal and catalytic domains is not predominantly hydrophobic, suggesting that exposing the surfaces would not be prohibitively unfavorable. In fact, the largest hydrophobic region at the interface is near the hinge region between the N-terminal and catalytic domains, where it would not be greatly exposed by a hinge motion.
In conclusion, the work reported here demonstrates the basis for aminopeptidase activity by PSA and suggests a mechanism for its broad substrate recognition. In addition, the path of polyglutamine substrates is defined, suggesting they may bind in a manner distinct from other peptide substrates.
Supporting information S1